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ROGRAM  FOR  ESTIMATING  UNCERTAINTIES  IN  QUANTILE 
ESTIMATES  DERIVED  FROM  EMPIRICAL  PEARSON  FITS 


§1.  Overview 

The  attached  program  assesses  the  variability  in  the  estimate  of  an  upper  percentage  point  of 
a  Pearson  curve  when  its  moments  are  estimated  fr  m  data.  This  overview  only  aims  to  describe 
its  use  in  broadest  terms.  Section  2  describes  the  theory  and  Section  3  the  program  in  detail.  A 
Fortran  listing  follows  in  an  Appendix. 

The  program  assumes  that  a  sample  of  size  nsamp  has  been  drawn  from  a  Pearson  type  IV 
distribution  (Pearson  type  VII  -  the  student  t  family  -  is  also  included,  but  the  Gaussian  law  is  not). 
On  the  basis  of  this  sample  it  is  assumed  that  the  first  four  raw  moments  have  been  empirically 
estimated  by  the  method  of  moments.  It  is  desired  to  estimate  an  upper  quantile  .  It  is  assumed 
that  this  is  done  by  solving  for  the  quantile  of  the  Pearson  curve  with  the  same  first  four  moments 
as  the  sample.  Let  us  call  the  estimate  obtained  in  this  way  zO.  The  purpose  of  the  program  is  to 
numerically  assess  the  uncertainty  in  xO  that  is  propagated  from  the  uncertainty  in  the  empirical 
moment  estimates  based  on  the  sample  of  size  nsamp.  This  calculation  is  done  via  the  “delta 
method*  and  hence  is  based  on  a  simple  asymptotic  Taylor  series  approximation  that  is  discussed 
in  the  attached  documentation. 

The  program  is  operated  as  follows  :  it  contains  a  main  program  quantile.f  and  subroutine 
files  subzero. f  and  subp4df.f.  Suppose  that  the  program  is  compiled  into  an  executable  program 
“quantile*.  On  the  Stanford  Statistics  department  VAX750/UNIX4.2bsd,  the  program  might  be 
run  as  follows: 

quantile  <input  >output 

This  specifies  that  the  input  data  (discussed  below)  is  read  in  from  the  data  file  "input*  and 
the  results  written  on  an  output  file  "output*  The  output  has  been  formatted  so  that  it  can  then  be 
read  into  an  appropriate  statistical  package  (S,  from  Bell  Labs,  is  what  I  had  in  mind  )  for  further 
analysis. 

Here  is  a  toy  example  of  an  input  file  : 


n 


0 

1 

.2 

3.9 

100 

.05 

0 

1 

.2 

3.9 

100 

.005 

0 

1 

.2 

3.9 

100 

.0005 

0 

1 

.2 

3.9 

100 

.00005 

The  first  line  contains  a  character  indicating  whether  long  output  is  desired  (y  for  yes  ,  n  f<  r 
no)  :  the  long  output  of  intermediate  calculations,  described  later,  that  is  mostly  of  interest  during 
debugging.  Subsequent  lines  contain  six  columns  entered  in  free  (ie  Fortran  list  -  directed  )  format: 


2 


Cols  1  -  4:/i,  <r2,  01,  02 

—  mean,  var,  skewness,  kurtosis  of  generating  distribution 
5:  nsarap  =  number  of  observations  taken 
6:  eps  as  percentile  that  is  to  be  estimated 

Here  is  the  output  file  produced  from  the  above  input  : 
long  output? 


mean 

var 

betal 

beta2 

nsamp 

eps 

xO 

dens 

s.d. 

0. 

1.00 

0.20 

3.90 

100. 

0.500e-l 

1.73 

0.953e-l 

0.189 

0. 

1.00 

0.20 

3.90 

100. 

0.500e-2 

3.12 

0.155e-l 

0.416 

0. 

1.00 

0.20 

3.90 

100. 

O.SOOe-3 

4.54 

0.234e-2 

0.661 

0. 

1.00 

0.20 

3.90 

100. 

0.500e-4 

6.08 

0.348e-3 

0.775 

The  first  line  is  a  prompt  which  is  useful  only  if  input  and  output  are  occurring  interactively 
at  a  terminal.  The  second  line  describes  the  contents  of  the  columns  :  in  addition  to  the  input 
variables,  we  have 

zO  :  the  upper  eps  -  th  percentile  corresponding  to  mean, var,  betal  and  beta2 
dens  :  the  value  of  the  Pearson  curve  density  when  evaluated  at  xO 

s.d.  :  the  approximate  standard  deviation  of  the  estimate  of  xO  that  is  obtained  by  using  the 
sample  moments  based  on  a  sample  of  size  mump 

By  selecting  just  the  numerical  output  (or  a  subset  of  the  columns)  ,  further  graphing  of  the 
output,  statistical  analysis,  etc.  is  possible. 

It  should  be  remarked  that  the  above  program  has  NOT  been  fully  debugged  and  checked  . 
Obvious  blunders  have  been  removed  and  some  tests  that  parameters  lie  within  the  Type  IV/ VII 
ranges  and  have  sufficiently  many  moments  have  been  built  in.  No  claims  are  made  for  the  quality 
of  the  code  or  algorithms.  However  the  program  should  be  useful  for  exploration  of  the  uncertainty 
in'  ed  in  empirical  determination  of  threshholds. 

Many  authors  have  warned  of  the  hazards  involved  in  fitting  Pearson  curves  on  the  basis  of 
empirically  determined  moments.  This  program  offers  the  capability  for  assessing  (  subject  to  the 
validity  of  the  delta  method  approximation,  of  course  !)  quantitatively  the  size  of  these  hazards. 

As  a  further  illustration  of  the  output  of  the  program,  the  following  experiment  was  run.  For 


certain  pairs  of  {fa,  fa)  falling  in  the  Type  IV  (or  VII)  region 

Case  1  2  3  4  5 


fa  0  0  112 

fa  3.3  4  5.25  6  7.5 


Table  1. 

and  sample  size  nsamp  =  100,  the  uncertainty  in  estimating  xO  for  e  =  .05,  .01,  .005,  .001,  .0005, 
.0001,  .00005,  .00001  is  calculated.  The  two  cases  with  zero  skewness  are  scaled  versions  of  the  t 
distribution  on  24  and  10  d.f.  respectively. 

The  raw  output  is  shown  in  Table  2.  One  interesting  plot  is  to  look  at  the  coefficient  of  variation 
of  the  estimated  percentile,  namely  s.d.(z0/z0)  These  values  are  plotted  against  —log  10(e)  in  the 
upper  box  of  Figure  1,  with  the  plotting  character  indicating  the  case  number.  Clearly  as  the 
skewness  and  kurtosis  increase,  so  the  coefficients  of  vaiuition  become  increasingly  unstable  the 
further  out  one  goes  into  the  tail  of  the  distribution.  A  second  plot  of  the  square  roots  of  the 
coefficients  of  variation  is  shown  for  comparison. 

Note  that  in  the  delta  method  calculations,  the  effect  of  increasing  sample  size  nsamp  is  simply 
to  reduce  the  s.d.'s  and  coeff.  of  variation  of  x,(0)  by  Jnsamp. 


alues 


Table  2.  Output  from  program  for  Input  parameters  given  in 
Table  1,  together  with  values  of  nsamp  and  z 
shown  in  columns  5  and  6. 


mean 

var 

betal 

beta2 

nsamp 

eps 

xO 

dens 

0. 

1.00 

0. 

3.30 

100. 

0 . 500e-l 

1.64 

0.133 

0.  - 

1.00 

0. 

3.30 

100. 

0.100e-l 

2.39 

0 . 577e-l 

0. 

1.00 

0. 

3.30 

100. 

0. 500e-2 

2.68 

0 . 396e-l 

0. 

1.00 

0. 

3.30 

100. 

0.100e-2 

3.32 

0 . 161e-l 

0. 

1.00 

0. 

3.30 

100. 

0 . 500e-3 

3.59 

0 . 109e-l 

0. 

1.00 

0. 

3.30 

100. 

0.100e-3 

4.20 

0 . 430e-2 

0. 

1.00 

0. 

3.30 

100. 

0. 500e-4 

4.46 

0 . 287e-2 

0. 

1.00 

0. 

3.30 

100. 

0.100e-4 

5.06 

0. 112e-2 

0. 

1.00 

0. 

4.00 

100. 

0 . 500e-l 

1.62 

0.105 

0. 

1.00 

0. 

4.00 

100. 

0.100e-l 

2.47 

0. 283e-l 

0. 

1.00 

0. 

4.00 

100. 

0 . 500e-2 

2.83 

0 . 156e-l 

0. 

1.00 

0. 

4.00 

100. 

0.100e-2 

3.71 

0 . 380e-2 

0. 

1.00 

0. 

4.00 

100. 

0 . 500e-3 

4.10 

0.205e-2 

0. 

1.00 

0. 

4.00 

100. 

0.100e-3 

5.09 

0.481e-3 

0. 

1.00 

0. 

4.00 

100. 

0 . 500e-4 

5.56 

0.256e-3 

0. 

1.00 

0. 

4.00 

100. 

0.100e-4 

6.73 

0. 590e-4 

0. 

1.00 

1.00 

5.25 

100. 

0. 500e-l 

1.82 

0. 590e-l 

0. 

1.00 

1.00 

5.25 

100. 

0.100e-l 

3.03 

0 . 107e-l 

0. 

1.00 

1.00 

5.25 

100. 

0. 500e-2 

3.57 

0.516e-2 

0. 

1.00 

1.00 

5.25 

100. 

0.100e-2 

4.86 

0.994e-3 

0. 

1.00 

1.00 

5.25 

100. 

0. 500e-3 

5.45 

0 . 497e-3 

0. 

1.00 

1.00 

5.25 

100. 

0.100e-3 

6.91 

0 . 103e-3 

0. 

1.00 

1.00 

5.25 

100. 

0. 500e-4 

7.58 

0.532e-4 

0. 

1.00 

1.00 

5.25 

100. 

0.100e-4 

9.26 

0 . 118e-4 

0. 

1.00 

1.00 

6.00 

100. 

0.500e-l 

1.78 

0.693e-l 

0. 

1.00 

1.00 

6.00 

100. 

0.100e-l 

3.03 

0.139e-l 

0  . 

1.00 

1.00 

6.00 

100. 

0. 500e-2 

3.60 

0.689e-2 

0. 

1.00 

1.00 

6.00 

100. 

0.100e-2 

5.07 

0 . 135e-2 

0. 

1.00 

1.00 

6.00 

100. 

0 . 500e-3 

5.76 

0.668e-3 

0. 

1.00 

1.00 

6.00 

100. 

0.100e-3 

7.57 

0.132e-3 

0. 

1.00 

1.00 

6.00 

100. 

0. 500e-4 

8.45 

0 . 661e-4 

0. 

1.00 

1.00 

6.00 

100. 

0.100e-4 

10.7 

0 . 133e-4 

0. 

1.00 

2.00 

7.50 

100. 

0.500e-l 

1.84 

0 . 414e-l 

0. 

1.00 

2.00 

7.50 

100. 

0.100e-l 

3.24 

0. 542e-2 

0. 

1.00 

2.00 

7.50 

100. 

0 . 500e-2 

3.88 

0.234e-2 

0. 

1.00 

2.00 

7.50 

100. 

0.100e-2 

5.52 

0 . 356e-3 

0. 

1.00 

2.00 

7.50 

100. 

0. 500e-3 

6.30 

0 . 163e-3 

0. 

1.00 

2.00 

7.50 

100. 

0.100e-3 

8.31 

0. 279e-4 

0. 

1.00 

2.00 

7.50 

100. 

0 . 500e-4 

9.27 

0 . 133e-4 

0. 

1.00 

2.00 

7.50 

100. 

0.100e-4 

11.8 

0.246e-5 

d. 

0.117 
0 . 980e~ 
0 . 964e~ 
0 . 949e~ 


0 . 934e-l 

0 . 875e-l 

■■■  s-v/.;- 

0 . 843e~l 
0.764e-l 

ft*# 

0.148 

•‘.'VV-aV- 

0.257 

.•  j 

0.357 

•  ,  t  ,  i  < 

0.613 

0.720 

■ .■•'  •" 

0.947 

1.04 

.•W.V. 

1.24 

M  i 

0.352 

0.779 

•>»» 

1.17 

2.62 

3.48 

6.01 

* 

7.33 

•*.  a  ■>. 

10.9 

0.827 

.- r-/ 

0.675 

1.36 

4.35 

-l->4 V.V.V 

6.04 

-’.V-'J 

10.5 

12.7 

17.9 

.  ^  il  iUr«t.T  v.' 

1.11 

1.67 

2.83 

.*  •  *  k  *  »  *  —  4  •  * 

10.9 

17.1 

38.8 

51.7 

91.2 

z.mrrm 

3 

’.7.VVUV1 

§2.  Sensitivity  of  Determination  of  Percentage  Points  of  Pearson  Curves  when  mo¬ 
ments  are  estimated  (Theoretical  documentation  to  accomp.viy  program.) 

Suppose  that  wish  to  estimate  the  percentage  point  xt,  which  for  given  s  solves  th  equation 

(1)  P(X  >  x)  =  s. 

Suppose  that  X  is  believed  to  follow  a  distribution  within  the  Pearson  family.  Then  the  equation 
assumes  the  form 

(2)  m*)  =  « 

where  the  vector  d  contains  the  parameters  specifying  the  precise  form  of  the  distribution  function 
F  (for  example  the  first  four  theoretical  moments).  If  d  is  known,  then  equation  (2)  can  be  solved 
to  yield  the  desired  percentage  point  z,(d).  Such  solutions  have  of  course  been  tabulated  for  a 
limited  selection  of  e  and  d  in  the  Pearson-Hartley  Biometrika  tables. 

There  is  sometimes  interest  in  situations  in  which  e  is  so  small  that  the  Biometrika  tables  are 
useless.  The  purpose  of  this  discussion  is  to  attempt  a  sensitivity  analysis  of  the  following  types: 
how  does  an  error  in  specification  of  d  propagate  into  the  resulting  percentage  point  x,(d)?  If 
d  is  estimated  with  a  given  standard  deviation,  what  is  the  corresponding  standard  deviation  of 
x#(d)?  The  analysis  will  be  simple  and  approximate,  based  on  'ie  so-called  “delta-method”  (Rao 
(1973,  §6g)). 

To  clarify  the  derivation,  we  regard  e  as  fixed  and  introduce  the  notation  /»(/?)  for  x,(d)-  Thus 
the  function  h  is  impli  /'y  defined  by  the  equation 

(3)  F{0,h(f}))  =  :. 

The  delta-method  approximation  simply  says  that  if  d  is  a  d  x  1  random  vector  with  mean  £"d  =  d. 
then 

(4)  Var^Md)  =  [Z?M^]‘[Var,dJ[£>A(^)] 

Here  Dh  is  a  d  x  1  vector  of  partial  derivatives,  and  Var^d  is  a  d  x  d  matrix.  Now  the  implicit 
function  theorem  says  (assuming  necessary  regularity  condition  )  that 


(5) 


Dh(0)  =  -\D7F[$MtT'\DxF(iiMm 


where  D\F  and  D^F  denote  partial  derivatives  with  respect  to  the  first  and  second  arguments  of 
F,  and  are  hence  lxl  and  d  x  1  vectors  respectively.  Writing  f(0,x)  =  jgF{0,x)  for  the  density 
corresponding  to  F,  (S)  reduces  to 


(6) 

so  that  (4)  becomes 


Dh(0)  = 


-DiF(0,h(0)) 

fWMfi)) 


(7)  V&rph(8)  =  ■~[DiF,][Va.rp8][DiF\  evaluated  at  (0,h(0)). 

In  our  example,  the  vector  0  of  parameters  describing  the  Pearson  curve  is  taken  to  be  0  = 
(pi,<x2 , 0i,  0i).  We  assume  that  these  will  be  estimated  from  i.i.d.  data  Xi,...,Xn,  which  have 
been  summarized  through  the  first  four  raw  moments  ft'k  =  k  =  1,2, 3, 4.  (Central 

moments  could  equally  well  be  used  :  raw  moments  are  taken  for  simplicity  of  calculation  here  : 
the  distinction  will  not  affect  the  final  result).  Of  couse,  we  have 


(8) 


so  that  if  we  estimate  0  by  0  =  5(p),  then  the  delta  method  estimate  of  the  variance  of  0  is 


(9)  Var(/7)  =  [Z?SjVar(/i)[Z?$]* 

where 

/I  0  0  0  \ 

0  1  0  0 

0  ~3pl/p*  2  fis/fil  0 

VO  -2/j 0  \/nl ) 

It  will  be  easier  to  work  with  the  raw  sample  moments  {/;*},  which  are  related  to  the  central 
sample  moments  by 


(Pl\ 

(  Mi  \ 

(11) 

p  = 

Pi 

PS 

=  h(m')  = 

I  Pi  ~  P? 

i  Ps  ~  3p2p\,+2p'? 

V  p\  -  *Pzp\  +  Sp'iPx  -  3p\*  / 

\pj 

Again  applying  the  delta  method,  we  arrive  at 


(10) 


(12) 


Var(^)  =  [0Sl(0ff|Var(£')[Z?ffp[0$J'. 


1*1 


V-Vs  +  -  12/if  erf  -Vi  1/ 

Finally,  we  note  that  the  entries  in  Var  (£')  are  obtained  from 
covVfcV*.)  =  ^EECov(x‘’x‘;) 

i  ■' 

(14)  =  -Cov(ATf,Xi  )  =  “  PkPk'Y 

Since  k  €  (1,2, 3, 4},  this  would  appear  to  require  us  to  know  the  raw  moments  of  the  appropriate 
Pearson  curve  up  through  order  8. 

§§2.1  Higher  moments  of  the  Pearson  Type  IV  family: 

Let  i  denote  the  ‘independent  variable’  with  origin  referred  to  the  mean  of  the  distribution. 
Then  the  coefficients  in  the  Pearson  curve  equation 

(151  1  df  -  *  +  * 

J(0,  i)  di  So  +  Sji  +  6a23 

are  related  to  via  the  relations  (compare  eqn  (3),  p  39  of  Elderton/Johnson  (1969)): 

a  =  ay/Tdfo  +  3)/r  T  =  2(5^3  -  6£,  -  9) 

(16)  So  = -o2(4^2-3/?l)r  Si  =  -a  S3  = -(2da -3^ -6)/r. 

Since  we  are  working  with  raw  moments,  we  make  the  transformation  x  =  x  -  n \  in  equation 

(16) ,  obtaining 

(17)  =  ^ - 

f($,x)dx  (So  -  ill* I  +  hp?)  +  (61  -  2S3^’i)x  +  S2I2 


60  +  Siz  +  S213 


if  we  set 


(19)  a  —  d  —  (i\  S0  =  So  —  S  V|  +  S 1  =  5 1  —  25^/*  j  S2  =  Sj 

Equations  (19)  and  (16)  together  relate  (a,  So,  Si,  S3)  to  the  more  familiar  parametrisation 
(/ij,ff3,#i,i?2).  The  virtue  of  the  (a, So, Si, S3)  form  is  that  it  yields  a  recurrence  relation  for  the 
raw  moments  of  a  Pearson  curve  (Elderton/Johnson,  pp  37-38): 


(20) 


[(n  +  2 )S3  +  l]jiJ,+1  =  -fu  +  (n  +  1  )Si )(i'n  -  nb0(i‘n_l 


which  we  will  use  for  n  +  1  =  2, . . . ,  8. 


§§2.2  Higher  moments  of  the  Pearson  type  IV  family  •  an  alternative  method  (not 
implemented) 

The  higher  (theoretical)  moments  of  a  Pearson  family  can  be  computed  via  recurrence  relations 
such  as  those  given  on  pp  37-38  of  Elderton/ Johnson  (ISCCj.  These  relations,  as  we  have  seen  require 
determination  of  a,bo,l>i,l>2  in  terms  of  the  input  parameters  t*,<r3 ,  fli,  As  a  check  on  these 

calculations,  one  can  use  the  simpler  recurrence  relation  (for  the  Type  IV  family)  given  on  p  64  of 
Elderton/ Johnson,  namely 


(21) 


An  = 


- — r{(n  -  l)aAn-2  - 

r  —  n  +  1 


which  is  expressed  in  terms  of  the  parameters  a,i/,r  that  are  obtained  directly  from  er2,di,/?2 
The  catch  is  that  the  moments  /in  are  calculated  relative  to  an  origin  located  va/r  to  the  right 
of  the  mean.  Thus,  to  obtain  the  raw  moments  /ij,  (about  the  original  origin),  a  transformation 
is  necessary.  If  i  has  origin  ua/r  to  the  right  of  the  mean,  then  it  is  related  to  the  original  co¬ 
ordinate  ibyi  =  i-  /j-j  =  i-  j  say.  We  are  interested  in 


tTkv  =  Cov(X*,X*')  -  Cov((X  +  *)*, (X  +  *)*  ) 

i  i' 


\k'\ 


Thus  if  we  set  E  =  (ok*<)«x4,  -  =  =  (aty)4X4  (where  ak}  =  (J)r*  ’  I (k  >  j)). 

then  we  can  write  the  above  in  matrix  form  as 


$  =  A^A*. 

Here  %  is  known  easily,  since  (7;j<  =  p;+;>  -  which  may  be  read  off  from  (21)  above.  Matrix 

multiplication  routines  can  then  be  used  to  obtain  $. 


§§2.3  Inputs  to  numerical  Integration  routines  -  Type  TV  Pearson  curves 

Here  we  suppose  that  V  has  the  type  IV  Pearson  density  (in  the  form  given  by  Elder- 
ton/Johnson  (1069  p  '.S)) 

i3  ±A 

f I  1  >•  =  r(l  f  —)  »  exp(-t/tan  ‘(i/a)) 


(22) 


-  cc  <  x  <  x 


_  6(02  -  0i  -  1) 

102  —  30i  —  6 

„  =  r(r-2)\Z?T 

A 


A={l6(r-l)-/Mr-2)*}‘/a 

Note  that  the  origin  is  vajr  after  the  mean  p\  and  that  the  argument  x  is  referred  to  this  origin. 
Note  also  that  o2,0i, 02  are  the  usual  variance  (square'!'!  skewness  and  kurtosis,  given  in  terms  of 
central  moments  nu  by  413,  and  /14 //i|  respectively.  The  normalization  constant  c  depends 

on  a2, 0i, 02,  and  will  be  calculated  in  the  numerical  integration  routine. 

As  demonstrated  on  p  64  of  Elderton/Johnson  (1969),  cumulative  probabilities  for  the  density 
f(0,x)  may  be  most  easily  computed  after  the  changes  of  variables  toa$  —  xja  and  ;r/2  —  fi  +  <j>, 
leading  to  the  equivalent  expressions 

F(0,io)=  r  f(0,x)dx 
J  —  00 

r*  0  /■*/  2 

(24)  =  /  cos T(6)t~v,d0l  I  cos T8t~v*d0  ,  ^0  —  tan  1(io /a) 

Jw/2  J-w/2 

—  J  sin r{<i>)e‘'^~w^d<t>/  J  sin T  ,  ^0  =  j  -  tan_1(x0/a) 

=  I(<(>o,  r,  v)/ 1(0,  r,  1/). 

If  the  mean  is  allowed  to  be  arbitrary  (as  we  do  here  to  permit  flexibility),  then  an  initial 
change  of  co-ordinates  is  required.  Given  parameters  0  =  (p,(T2 , 0i,  02),  we  obtain 

F(0,x)  =  PB(X  <  x) 

=  P(X-(»+y)<X-(p+^)) 

(25)  =F(0,i) 

where  x  —  x  -  (p  +  va/r).  Differentiating,  we  get 

(26)  f(0,x)  =  f(0,x). 


‘V*  S  "V  V  **  S'  ~S  'j-  *  *  /  V  *y»  *  J-  “  j-  ■ 


y/vY/vvWV/.A 


rWtWIVCTOCT  VV  V  V  \T  *  wA"-TVVV^A^r->^ 


r^^jr»x*^^]H 
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§3.  Specific  details  of  the  program 

As  described  in  the  theoretical  development  of  the  previous  section,  the  basic  goal  of  the 
program  is  the  calculation  of 

(27)  Vsr^(x«(^))  - 

(combine  formulas  (7)  and  (12)). 

The  labels  A-D  correspond  to  labels  in  the  program  comments. 

A:  a)  The  program  requests  input  parameters: 

d  =  {p,02,0i,fa)  -  parameters  describing  the  Pe arson- type  IV 

distribution  which  is  assumed  to  generate 
the  data.  (There  is  a  check  that  they  are  in  the 
type  IV  region.) 

naamp  -  sample  size  available  for  estimating 


e=upper  percentile  -  entering  say,  e  =  .005  would  ask  for  the 

upper  1/2 %  point  of  the  specified  type  IV  curve, 
b)  The  zero-finding  subroutine  is  used  to  find  the  e,k  quantile  zo,  solving 

Fp{x 0)  =  e. 

B:  This  section  is  largely  self-explanatory,  assembling  components  for  the  basic  calculation.  Com¬ 
putation  of  D§  and  DH  is  straightforward,  but  some  care  with  co-ordinate  systems  is  needed 
in  evaluation  of  pk,k  =  1, . . .  ,8;  the  raw  moments  of  the  theoretical  type  IV  distribution.  The 
p'k  are  found  by  a  recurrence  relation  (20)  in  the  original  co-ordinate  system.  To  express  the 
coefficients  0,60,61,63  in  (20)  in  terms  of  the  input  data  p,<r2,/3 i,da.  a  detour  is  made  through 
a  co-ordinate  system  referred  to  the  mean  p  (see  formulas  (15)-(19)). 


'..V« 

m 

Vv-'v' 

-  w 

V-'/i 


/.V.V, 

«\W-\ 
**  «*, 


•y/v 

>&- 

"La  * 

P  ^ 


•  *  k' 

.A-’. 


v*y.> 


*  -  *  •  v 


C.  This  section  performs  the  numerical  differentiation  (rather  crudely!).  Here  is  the  principle  used: 
imagine  a  real  valued  function  g(z\, . . . ,  zr).  The  i**  partial  derivative  at  z°  =  z®)  is 

estimated  as 

dg  .  0,  ^  g(z°  +  g,e,)  -  g(r°  -  gte,) 
dzi  *  e»  +  ef 

Here  e,  =  (0, . . . ,  1, . . .  ,0)  is  the  i,h  unit  co-ordinate  vector,  and  st,  e,  >  0  are  non-negative, 
not  necessarily  equal,  and  not  both  zero. 


mrm 


vWS' 

•w 

OV\rs 


Checks  are  performed  that  the  arguments  z°  +  ere,-,  z°  -  ete,-  lie  within  the  allowable  Pearson 
type  IV  range,  if  they  do  not,  then  they  are  adjusted  so  that  they  do. 

The  size  of  the  steps  «»,«,  are  governed  (in  the  absence  of  the  restrictions  mentioned  above) 
by  a  small  parameter  S,  and  the  size  of  the  argument.  Thus,  in  approximating  the  partial 
derivative  at  z° 

er  =  max(Mzf). 

D.  Final  evaluation. 

The  matrix  multiplications  are  done  using  IMSL  routines  for  computing  AB,ABt,AtB.  (doc¬ 
umentation  attached).  The  final  step  is  to  compute  the  denominator  in  (27):  the  population 
density  at  the  quantile,  and  this  is  done  using  formula  (22).  Note  that  the  normalizing  constant 

e  =  -1(0,  r,v) 

Q 

(see  notation  of  subsection  2.3). 

OUTRANGE  (fiufiz) 

This  subroutine  (which  is  invoked  both  at  initial  input  of  parameters,  and  when  parameters 
are  modified  during  the  computation  of  numerical  derivatives)  checks  whether  lie  in  the  Type 

IV  region  (or  the  type  III  boundary  corresponding  to  the  t-family,  when  fi\  =  0).  Thus  the  routine 


checks  if 

(i)  fix  >  0,  fi2  <  3 

(ii)  0<*<1,  where  k  =  5^357 -«) 

(A  picture  of  the  region  defined  by  (i)  and  (ii)  is  given  in  Volume  2  of  the  Biomctrxka  Tables 
for  Statisticians). 


ZERO  {it,a2,fa,fa,e,x0) 

This  subroutine,  for  given  Pearson  curve  parameters  n,o2,$i,$ a,  and  given  tail  probability  s, 
solves  for  the  corresponding  quantile  zo  such  that 

Ffi(xo)  -  ^  *o)  =  «• 

It  uses  the  IMSL  routine  ZBRENT,  which  uses  (I  think)  a  combination  of  bisection,  linear 
and  inverse  quadratic  interpolation  to  find  the  zero  of  an  equation  •  in  this  case,  the  equation 
Ffi{x)  -  e  =  0. 

This  equation  clearly  has  a  unique  root,  and  ZBRENT  requires  two  initial  values  LEFT  and 
RIG  HT,  which  must  be  of  opposite  sign.  Somewhat  arbitrarily,  the  present  code  takes 

LEFT  =/i 
RIGHT  =p+12  a 

on  the  assumption  that  e  will  be  chosen  so  that  we  are  working  in  the  right  tail  of  the  distribution, 
and  that  12  standard  deviations  to  the  right  of  the  mean  should  leave  a  smaller  remaining  tail  area 
than  any  conceivably  realistic  choice  of  e. 

Of  course,  ZBRENT  repertedly  calls  and  evaluates  the  function  P\DF  -  the  cumulative  dis¬ 
tribution  function  of  a  Pearson  type  IV  curve. 

PADF(»,o2,fa,fa,x) 

This  function  computes  the  c.d.f.  of  a  random  variable  X  following  a  Pearson  type  IV  curve 
described  by  parameters  —  [p,o2,fa,fa)  and  returns  the  value  Fp(x)  —  P$(X  <  x). 

Subsection  2.3:  “Inputs  to  numerical  integration  routines  :  type  IV  Pearson  curves”  describes 
the  form  in  which  the  type  IV  density  is  used  in  the  numerical  integration.  The  actual  calcula¬ 
tion  is  done  by  the  IMSL  routine  DMLIN  (see  documentation  attached)  which  employs  Gaussian 
integration  separately  on  the  numerator  and  denominator  in  the  expression  (24) 

F(fax°)  =  I(<t>o,r,  v)/I(0,  r,  v). 
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c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 


program  quantile 

real*8  mp(8) 
real *  8  mu(4) 

real  *8  mm,  sig2,  betal,  beta2,rhs,x0 

real*8  dh(4,4) 

real*8  dg(4,4) 

real *8  r,nu,a,del 

real *8  aa,b0,bl,b2 

real *8  num, den  an 

real*8  cov(4,4) 

real  *8  gam,  abar,bObar,  bibar  ,b2bar 
real* 8  nsamp 

declarations  for  matrix  multiplication,  numerical  derivatives 
and  calculation  cf  density 

real* 8  tl(4,4),  t2(l,4),  t3(l,4),  t4 ( 1 , 1 ) 

real* 8  delta,  epsl,epsr,  df(4,l),p4df,small 

integer  l,m,n,ia,ib,ic,ier 
integer  maxf cn , nn , iier 

real*8  dmlin,low(l)  ,up(l),aerr,rerr,integ 
real*8  pi,piby2,x, temp,dens 
external  f 

logical  outrange ,  long 
common  r,nu,piby2 
character  letter 

data  dh/16*0.0/ 
data  dg/16*0.0/ 

these  computations  would  be  needed  were  we  to 
read  in  raw  moments  muprime(l:4)  > 

< compute  first  four  centered  moments > 

mu(l)  -  mp(l) 

mu(2)  -  mp(2)  -  mp(l)**2 

mu(3)  -  mp(3)  -  3*mp( 2) *mp(  1)  +  2*mp(l)**3 

mu(4)  -  mp{4)  -  4*mp(3)*mp(l)  +  6*mp(2)*mp(l)**2  -  3*mp(l)**4 

< compute  mean  variance  betal,beta2 

non  -  mu(l) 
sig2  -  mu(2) 

betal  -  mu(3)**2  /  mu(2)**3 
beta2  -  mu(4)  /  mu(2)**2 

print  *,  "long  output?  " 

read  *  ,  letter 

long  -  letter  .eq.  'y' 

write  (6,13)  "mean" , "var" , "betal" , "beta2" , "nsamp" , "eps" , "xO" , 
♦  "dens" , "s.d. p 

do  10,  iter  -  1,10000 
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c  ************** 

c  SECT  A.  :  read  in  mean ,  variance ,  betal ,  beta 2 ,  xO,  upper  %tile 

if  (long)  print  *,  "input  mean, var, betal , beta 2, sampsize, upper  %tile 
read  (5,*,end  -15  )  mm,sig2, betal ,beta2,nsamp,rhs 
if  (long)  print  *,  mm, sig2 , betal , beta2 , nsamp , rhs 

if  ( outrange ( betal, beta2 ) )  print  *,  "warning:  parms  outside  typerv  " 
c  now  find  quantile  xO  yielding  tail  probability  rhs 

call  zero(n*n,  s  ig2,  betal,  beta2,  rhs,  xO) 
c  ********************** 

c  SECT.  B.  :  calculate  centered  moments  thru  order  four 

c  using  formula  (8) 

mu(l)  -  ran 
mu(2)  -  sig2 

mu(3)  -  sqrt (  betal  *  mu(2)**3  ) 
mu(4)  -  beta2  *  mu(2)**2 
if  (long)  print  *,  "mu  -  " 
if  (long)  write  (6,44)  (mu(i) ,i-l,4) 

44  format  (4F14.6) 

c  calculate  raw  moments  thru  order  four  using  formula  (11) 

mp(l)  -  mu(l) 

mp(2)  -  mu(2)  +  mp(l)**2 

mp(3)  -  mu(3)  +  3*mp(l)*rap(2)  -  2*mp(l)**3 

mp(4)  -  mu(4)  +  4*mp(3)*mp(l)  -  6*mp( 2 ) *mp( 1) **2  +  3*mp(l)**4 

c  compute  entries  in  Jacobian  matrix  for  H  using  formula  (13) 

dh(l,l)  -  1. 

dh(2,2)  -  1. 

dh(3,3)  -  1. 

dh(4,4)  -  1. 

dh(2,l)  -  -  2*mp(l) 

dh(3,l)  -  6*mp(l)**2  -  3*mp(2) 

dh(  3, 2)  -  -  3*mp(  1) 

dh(4,l)  -  (-4*mp(3))  +  12*mp(2)*mp(l)  -  12*mp(l)**3 
dh(4 , 2)  -  6*mp(l)**2 
dh(4,3)  -  -4*mp(l) 

if  (long)  print  *,  "dh 

if  (long)  write  (6,140)  ( (dh(i, j ) , j-1,4 ) , i-1,4 ) 

140  format  (/(4F14.6)  ) 

c  compute  entries  in  Jacobian  matrix  for  G  using  formula  (10) 

dg ( 1 , 1 )  -  1. 
dg  ( 2 , 2 )  -  1. 

dg ( 3 , 2)  -  ( -3*mu( 3 ) **2  )  /  (mu(2)**4  ) 
dg ( 3 , 3 )  -  (  2*mu( 3 )  )  /  (mu(2)**2) 

dg(4,2)  -  (  -2*mu(4)  )  /  (  mu(2)**3  ) 


u  o 
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c 


c 


c 

c 


c 


20 

80 

c 


40 

30 


dg(4,4)  -  1./  (mu(2) **2  ) 
if  (long)  print  *,  "dg 

if  (long)  write  (6,140)  ( (dg(i, j ) , j-1,4 ) ,i-l,4 ) 


■*.v  vl 


compute  r,nu,a,del  -  parameters  in  type  IV  density  using  formula  (A. 2) 

r  -  (6.*(beta2  -  betal  -  1.))  /  (  2.*beta2  -  3.*betal  -  6.) 
del  -  sqrt(  16.*(r-l.)  -  betal* (r-2. )**2  ) 
nu  -  (  (-r)*(r-2. )*sqrt(betal)  )/del 
a  -  sqrt(sig2)  *  del  /  4. 

check  to  see  that  8th  moment  exists  (it  often  doesn't) 


j-vv 


if  (  r  .le.  7)  then 

print  *  , "  8th  moment  infinite  :  abort  " 
stop 

end  if 

set  to  work  on  higher  moments  via  the  recurrence  relations 
begin  in  scale  centered  on  mean  -  see  formulas  (16) 


gam  -  2.*(5*beta2  -  6*betal  -  9.) 
abar  -  sqrt(sig2  *  betal)  *  (beta2  +  3.)  /gam 
bObar  -(-  sig2)*(  4*beta2  -  3*betal  )  /  gam 
bibar  -  -  abar 

b2bar  -  -  (  2* beta 2  -  3 * betal  -  6.)  /  gam 

now  convert  to  absolute  origin  using  formulas  (19) 

aa  -  abar  -  mp( 1) 

bO  -  bObar  -  blbar*mp(l)  +  b2bar*mp(l)**2 
bl  -  bibar  -  2*b2bar*mp(l) 
b2  -  b2bar 

now  set  up  recurrence  relation  using  formula  (20)  : 
note  mp(l)  through  mp(4)  were  calculated  earlier 

do  20,  i  -  5,8 

num  -  (  aa  +  i*bl  )*mp(i-l)  +  (i-1) *b0*mp(i-2) 

denom  -  (i+l)*b2  +  1. 
mp(i)  -  (~num)/denora 

continue 

if  (long)  print  *,  "mp  -  " 
if  (long)  write  (6,80)  (mp( i) , i-1, 8) 
format  (4fl4.6) 

set  up  covariance  matrix  of  raw  sampling  moments  (cf.  formula  (14) 

do  30,  i  -  1,4 

do  40,  j  -  1,4 

cov ( i , j )  -  (  mp(i+j)  -  mp ( i ) *mp ( j )  )/nsamp 

continue 

continue 

if  (long)  print  * ,  "cov 

if  (long)  write  (6,140)  ( (cov( i , j ) , j-1, 4 ) , i-1 , 4 ) 
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c  ************************ 

c  SECT.  C.  :  It's  time  to  compute  df  :  numerical  derivatives  of 

c  Pearson  cdf  vrt  mean , var , betal , beta  2 


delta  -  .1 

epsr  -  max(  delta,  am  *  delta  ) 
epsl  -  epsr 

df(l,l)  -  (  p4df(n®+epsr/sig2, betal, beta2,x0)  - 

p4df (mm-epsl, sig2, betal, beta2,x0)  )/  (epsl+epsr) 


epsr  -  max(  delta,  sig2  *  delta  ) 
epsl  -  epsr 

if  (epsl  .gt.  .25*sig2  )  epsl  -  . 25*sig2 
df(2,l)  -  (  p4df(n«n,  sig2+epsr,  betal,  beta2,x0) 

p4df (nan, sig2-epsl, betal, beta2,x0)  )/  (epsl+epsr) 


small  -  l.d-7 
do  1001  ,1-1,8 

epsr  -  max(  delta,  betal  *  delta  ) 
epsl  -  epsr 

if  (  outrange (betal+epsr, beta 2)  )  epsr  -  0. 
if  (  outrange (betal+epsl,beta2)  )  epsl  -  0. 
if  (  (epsl+epsr)  .gt.  small  )  goto  1002 
if  (  i  . eq.  8  )  then 

print  *,  "  unfixable  epsl+epsr  -  0  in  del (betal)  :  aborted 
stop 
end  if 

delta  -  delta/2. 

1001  continue 


1002  df(3,l)  -  (  p4df (mm, sig2, betal+epsr, beta2,x0)  - 

+  p4df (mm,sig2,betal-epsl,beta2,x0)  )/  (epsl+epsr) 

delta  -  . 1 
do  1011  ,  i  -  1,8 

epsr  -  max(  delta,  beta2  *  delta  ) 
epsl  -  epsr 

if  (  outrange (betal, beta 2+epsr)  )  epsr  -  0. 
if  (  outrange (betal, beta2-epsl)  )  epsl  -  0. 
if  (  (epsl+epsr)  .gt.  small  )  goto  1012 
if  (  i  .eq.  8  )  then 

print  *,  "  unfixable  epsl+epsr  -  0  in  del(beta2)  :  aborted 
stop 
end  if 

delta  -  delta/2. 

1011  continue 


1012  df(4,l)  -  (  p4df (mm, sig2, betal, beta2+epsr,x0) 

+  p4df(wn,sig2, betal, beta2-epsl,x0)  )/  (epsl+epsr) 


if  (long)  print  *,  "  df  -  " 

if  (long)  write  (6,120)  ( (df (i, j ) , i-1,4 ) , j-1, 1) 
120  format  (/(4fl4.6)  ) 


19 
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c  SECT  D.  -  matrix  multiplications: 


c 

tl 

-  dg  *  dh 

c 

t2 

-  t(df )  *  tl 

c 

t3 

-  t2  *  cov 

c 

t4 

-  t3  *  t(t2) 

1-4 
m  -  4 
n  -  4 
ia  -  4 
ib  -  4 
ic  -  4 

call  vmulff (dg,dh,l,m,n,ia,ib,tl,ic,ier) 
if  (  ier  .ne.  0  )  print  *  ,  "ier  -  ",  ier 

1-4 
m  -  1 
n  -  4 
ia  -  4 
ib  -  4 
ic  -  1 

call  vmulfm( df ,  tl ,  1 , m,  n ,  ia ,  ib, t2 ,  ic ,  ier ) 
if  (  ier  .ne.  0  )  print  *  ,  "ier  -  ",  ier 

1-1 
m  -  4 
n  -  4 
ia  -  1 
ib  -  4 
ic  -  1 

call  vmulff (t2,cov,l,m,n, ia,ib,t3,ic, ier) 
if  (  ier  .ne.  0  )  print  *  ,  "ier  -  ",  ier 

1-1 
m  -  4 
n  -  1 
ia  -  1 
ib  -  1 
ic  -  1 

call  vmul f p ( t3 ,  t2 , 1 ,  m,  n ,  ia ,  ib ,  t4 ,  ic ,  ier ) 
if  (  ier  .ne.  0  )  print  *  ,  "ier  -  ",  ier 

if  (long)  print  *,  "  tl  -  " 

if  (long)  write  (6,140)  ((tl(i, j), j-1,4) ,i-l,4) 

if  (long)  print  *,  "  t2  -  " 

if  (long)  write  (6,120)  (t2(l,j), j-1,4) 

if  (long)  print  *,  "  t3  -  " 

if  (long)  write  (6,120)  (t3(l,j), j-1,4) 

if  (long)  print  *,  "  t4  -  " 
if  (long)  write  (6,140)  t4(l,l) 

c  compute  density  at  xO,  obtaining  dens  ,  using  formula  (A.l) 


result  -  t4(l,l)/  (dens* *2) 
if  (long)  then 

print  *,"x0-  and  has  asy 

else 

write  (6,12)  mm,s±g2,betal,beta2 

end  if 

format  (1H  ,4F7.2,1H  ,5G10.3el) 

format  ( 4A7 , 1H  , 5A9  ) 

continue 

print  *  ,  " 
end 
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real *8  function  f(n,x) 
real* 8  x(n) ,r ,nu,plby2 
common  r,nu,piby2 

f  -  sin(x(l))**r  *  exp(  (x(l)-piby2)*nu  ) 

return 

end 


logical  function  outrange  ( betal ,  beta  2 ) 

c  checks  to  see  if  the  skewness  and  kurtosis  lie  within  the  range 

c  allowed  for  the  Pearson  type  IV  system:  check  is  based  on  the  kappa 

c  criterion  given  in  Elderton/ John  son  p  45 

real *8  betal ,  beta  2 , num,  den  am, ratio 

num  -  betal* (beta 2  +  3.)  **  2 

den  cm  -  4 . * (  4 . *beta2  -  3 . *betal )  *  (  2 . *beta2  -  3 . *betal  -  6 . ) 
ratio  -  num/denotn 

outrange  -  betal  .It.  0.  .or.  ratio  .ge.  1.  .or.  ratio  .It.  0. 

+  -or.  beta2  .It.  3.0 


return 
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real*8  function  p4df (mm, sig2, betal, beta2,x0) 

real*8  mm, sig2, betal ,beta2,x0 

real*8  r,nu,a,del 

integer  maxfcn,n,ier 

real*8  dmlin,low(l) ,up(l) ,aerr,rerr 

real*  8  pi , piby 2 , phi , x 

real *  8  num , denom 

common  r,nu,piby2 

external  f 

n  -  1 

rerr  -  1 .d-7 
maxfcn  -  10000 

c  compute  r,nu,a,del  -  parameters  in  type  IV  density 

r  -  (6.*(beta2  -  betal  -  1.))  /  (  2.*beta2  -  3.*betal  -  6.) 
del  -  sqrt(  16.*(r-l.)  -  betal* (r-2. )**2  ) 
nu  -  (  (-r)*(r-2. )*sqrt(betal)  )/del 
a  -  sqrt(sig2)  *  del  /  4. 

c  print  * ,  "r  -  H,r,"nu-  ",nu,"a-  ",a 

c  transform  origin  to  canonical  form  ,  and  compute  lower  limit  phi 

pi  -  4 .  *  atan ( 1 - ) 
piby2  -  pi/2. 

x  -  xO  -  (  mm  +  (nu*a/r)  ) 
phi  -  piby2  -  atan(x/a) 

up(l)  -  phi 
low( 1 )  -  0 . 

num  -  dmlin(f ,lcw/ up, n, maxfcn, aerr, rerr, ier) 
if  (  ier  .ne.  0  )  then 

print  *,  "  ier  -  "  ,  ier 

end  if 
up(l)  -  pi 

denom  -  dmlintf/low^p^maxfc^aerr/rerr/ier) 
if  (  ier  .ne.  0  )  then 

print  *,  "  ier  -  "  ,  ier 

end  if 

p4df-  num/denom 

return 

end 
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subroutine  zero(mm, sig2 , betal , beta2 , rhs , xO ) 

integer  nsig, maxfn, ieer 
real* 8  eps, left, right, funct 
real*8  mn,sig2, betal, beta2, rhs, xO 
real*8  tmm,tsig2, tbetal, tbeta2, trhs 
external  funct 

coninon  /brent/  tmm,tsig2, tbetal, tbeta2, trhs 

tsig2-sig2 

tbetal-betal 

tbeta2-beta2 

trhs-rhs 

eps  -  0.0 
nsig  -  5 
left  -  mm 

right  -  mm  +  12. *sqrt(sig2) 
maxfn  -  200 

call  zbrent(  funct, eps, nsig, left, right , maxfn, ieer  ) 
xO  -  right 

c  print  *,  "  zero  is  at  ",  xO 

end 


real *8  function  funct (y) 

real*8  turn, tsig2 , tbetal , tbeta2 , trhs , y , p4df 
common  /brent/  tmm,tsig2, tbetal, tbeta2, trhs 

funct  -  p4df (tmm, tsig2, tbetal, tbeta 2, y)  -  trhs 
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